A ghost moth olfactory prototype of the lepidopteran sex communication

Abstract Sex role differentiation is a widespread phenomenon. Sex pheromones are often associated with sex roles and convey sex-specific information. In Lepidoptera, females release sex pheromones to attract males, which evolve sophisticated olfactory structures to relay pheromone signals. However, in some primitive moths, sex role differentiation becomes diverged. Here, we introduce the chromosome-level genome assembly from ancestral Himalaya ghost moths, revealing a unique olfactory evolution pattern and sex role parity among Lepidoptera. These olfactory structures of the ghost moths are characterized by a dense population of trichoid sensilla, both larger male and female antennal entry parts of brains, compared to the evolutionary later Lepidoptera. Furthermore, a unique tandem of 34 odorant receptor 19 homologs in Thitarodes xiaojinensis (TxiaOr19) has been identified, which presents overlapped motifs with pheromone receptors (PRs). Interestingly, the expanded TxiaOr19 was predicted to have unconventional tuning patterns compared to canonical PRs, with nonsexual dimorphic olfactory neuropils discovered, which contributes to the observed equal sex roles in Thitarodes adults. Additionally, transposable element activity bursts have provided traceable loci landscapes where parallel diversifications occurred between TxiaOr19 and PRs, indicating that the Or19 homolog expansions were diversified to PRs during evolution and thus established the classic sex roles in higher moths. This study elucidates an olfactory prototype of intermediate sex communication from Himalaya ghost moths.


Introduction
Sexual dimorphism is ubiquitous across the animal kingdom [1]. For most animals, mating by partner allocation is an indispensable process to ensure population continuity [2]. Sex roles often form under the pressure of sexual selection [3]. In general, the female, which exhibits greater parental investment, becomes a limiting resource for the less caring male, so that males compete for access to females [4]. In insects, sex pheromones are an effective investment for the male to gain opportunities to mate with the female successfully. One well-studied example is the fruit fly Drosophila melanogaster, in which males typically release a specific pheromone called cis-11-vaccenyl acetate (cVA) to gain an advantage in mating [5]. However, sex roles appear to be reversed in moths [6]. Female moths invest in synthesizing and releasing sex pheromones to attract male moths, and males have evolved distinct structures for sensing pheromones [7, 8]. Therefore, the study of pheromones and pheromone perception can expand our understanding of the evolution of sex roles in animals.
One well-known animal lineage that relies on pheromone communication is Lepidoptera, comprising nearly 160,000 extant species and forming a key branch of insects [9]. Lepidopteran pheromones were well studied in the past decade, and most can be classified into types 0, I, II, and III, according to their hydrocarbon chains, double-bond allocations, and terminal functional groups [10]. Among them, type I pheromones, consisting of straight-chain acetates, alcohols, or aldehydes with 10 to 18 carbon atoms, make up 75% of all known sex pheromones and are employed by most moth families [11]. Pheromones are detected by pheromone receptors (PRs)/odorant receptor coreceptors (ORco) on the dendrites of olfactory sensory neurons. Based on the pheromone types, the corresponding PR family can be classified into type 0, I, and II clades [12]. However, the recent discovery of Lampronia capitella odorant receptor (OR) 6/ORco and Spodoptera littoralis OR5/ORco has revealed a novel "PR clade" that is distant from the type I PR clade [13, 14]. This implies that the mechanisms underlying the evolutionary process of ORs for detecting pheromones in Lepidoptera need to be explored.
The neural architectures of pheromone perception appear to be conserved in moths [15]. A typical perception of type I pheromone is achieved through a labeled-line olfactory coding pattern in higher moths, such as Noctuidae. Pheromones are detected by olfactory sensory neurons housed in sensilla trichodea on the antennae. After the PR/ORco complex has been activated by the corresponding pheromone, the resulting signals are projected to the primary olfactory center, the antennal lobe [16]. In Lepidoptera, the antennal lobe shows obvious sexual dimorphism. The male-specific macroglomerular complex (MGC) is located at the antennal entry and mainly processes pheromone signals [17]; in addition, in some species (e.g., Cydia pomonella and Bombyx mori), it also responds to plant volatiles [18, 19]. The counterparts of the MGC in females are usually called the large female glomeruli (LFG), which process oviposition and host-choosing signals, but the LFG are generally not as enlarged as the MGC [20, 21].
The ghost moths (Hepialoidea: Hepialidae) from Exoporia are primitive Lepidoptera species and form an especially interesting lineage for studying the evolution of sex roles and pheromone communication [22]. Hepialids represent an early branch from the line leading to the heteroneuran Ditrysia, and the latter includes almost all the lepidopteran species that use typical PR-based olfaction for pheromones. Notably, the sex roles of Hepialidae species show diversity; for example, Hepialus hecta and Hepialus humuli exhibit courtship behavior that is very different from the usual moth pattern, as males hover in groups to attract females [22]. Moreover, ghost moths have undergone asymmetrical divergence of a duplicated gene (e.g., the zen gene) that delivered functional alterations in subsequent species, providing insights into the evolutionary process of Lepidoptera [23]. Thitarodes, Ahamus, and Hepialus ghost moths, as the hosts of the medicinal fungus Ophiocordyceps sinensis, are endemic to the Qinghai-Tibet Plateau [24]. The isolated ecological habitat and prolonged life cycle of these so-called Himalaya ghost moths provide ideal opportunities for the retention of pheromone-receptive characteristics from shared ancestors of Lepidoptera [25-27]. While most previous research has focused on mating behaviors and pheromone identification in hepialids [22, 28-34], few studies have explored their potential pheromone-sensing neural architecture or annotated the odorant receptor family in Hepialidae species.
In this study, we presented the unique evolutionary position of olfaction in ghost moths, characterized by comparative neurology and phylogenomics. Our results demonstrated that the antennal lobes of both males and females of 3 Himalaya ghost moth species (Ahamus jianchuanensis, Thitarodes armoricanus, Thitarodes xiaojinensis) have an enlarged antennal entry part and lack obvious sexual dimorphism when compared to evolutionarily later lepidopterans.
Comparative genomics further revealed that the ghost moth T. xiaojinensis expanded a specific Or19 tandem array instead of the classic type I PR clade. Behavioral tests indicated that the ghost moth T. xiaojinensis exhibits similar sex roles between males and females in courtship, possibly due to their pheromonal neural architectures without sexual dimorphism and the specific Or19 tandem array. In summary, this study uncovers a mechanism for the occurrence of functional ORs such as PRs through asymmetric divergence in Lepidoptera.

Chromosome-level genome assembly of Himalaya ghost moth
A T. xiaojinensis (NCBI:txid1589740) larva was sequenced using Nanopore long-read technology, resulting in 319.9 Gb of clean reads. The draft genome of 3.1 Gb, comprising 1,645 contigs with a contig N50 of 5.4 Mb, was assembled using NextDenovo, corrected with minimap2 and NextPolish, and refined by removing contaminants. Utilizing Hi-C interaction data, the primary assembly was divided into 31,434 contigs, and 31,391 contigs (99.86% in length) were anchored to 32 chromosomes (Supplementary Fig. S1). BUSCO analysis revealed 91.8% complete genes in the final chromosome-level genome assembly, which was subsequently employed in downstream analysis.
The genome of T. armoricanus (NCBI:txid92013) was sequenced on Illumina HiSeq2000, yielding 877.7 Gb of clean data for scaffolding. The final assembly had a total scaffold length of 3,168 Mb, with N50 values of 27.8 kb and 176.2 kb for contigs and scaffolds, respectively. BUSCO analysis indicated that 90% of single-copy insect orthologs were complete. We also conducted BUSCO analysis on the transcriptomes of A. jianchuanensis (NCBI:txid92022), and a completeness of 96.6% was observed. The genome and transcriptomes were subsequently employed in downstream analysis.
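As a brief aside on the contiguity statistic reported above: N50 is the length of the shortest sequence in the smallest set of longest contigs (or scaffolds) that together cover at least half of the assembly. The sketch below is a minimal illustration of that definition, not the authors' pipeline; the function name `n50` and the toy lengths are assumptions made for the example.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Hypothetical helper for illustration only: sort lengths from longest
    to shortest and return the length at which the running total first
    reaches half of the total assembly size.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (not data from this study): total = 54, half = 27.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

Because the statistic depends only on the sorted lengths, the same function applies to contigs and scaffolds alike, which is why the two N50 values above can differ so strongly for one assembly.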

Evolutionary position of ancient Himalaya ghost moths based on phylogenomics analysis
We carried out a phylogenomics analysis based on genomes and transcriptomes of 3 ghost moth species, together with 13 Lepidoptera and 2 outgroups (for data sources, see Supplementary Table S1). The separation of exoporian and ditrysian Lepidoptera occurred by the end of the Triassic Period, around 205 million years ago, while the speciation of the Thitarodes moths soon followed that of A. jianchuanensis, approximately 26 million years ago, by the end of the Paleogene Period. Although Ahamus and Thitarodes represented an ancient moth lineage, the species within it diverged in parallel with higher moths (Fig. 1A). We next asked what olfactory traits were maintained during the evolution of Himalaya ghost moths.

Nonsexual dimorphic shortened antennae and enlarged glomeruli of Himalaya ghost moths
The Himalaya ghost moths had no observable proboscis but possessed intact antennae and labial palps (Fig. 1B). Additionally, we found that these moths had the shortest antennae among 32 lepidopteran families [35] (Supplementary Fig. S2), and their antennae were dominated by sensilla trichodea (Fig. 1C, Supplementary Fig. S3).
The antennal lobe morphological atlas showed that the 3 ghost moth species overall presented significantly fewer glomeruli (23 to 37) compared with other moths (45 to 80; Fig. 1D, Supplementary Data S1). Among 96 tested brains, the ordinary glomeruli arrangements in Himalaya ghost moths were distinguishable from those in the compared species. The families Hepialidae, Pieridae, and Plutellidae had less intraspecies variation in glomerular arrangements compared to the other 4 higher moth families (Supplementary Fig. S4). The MGC consisted of 2 to 3 glomeruli in Himalaya ghost moths, and identical areas were confirmed in all tested species. Female LFGs were distinguishable in earlier species, including the Himalaya ghost moths and the diamondback moth Plutella xylostella, compared with the later species (Fig. 1D).
Volume proportions of the MGC and LFG glomeruli across tested species were compared. The comparison showed that Himalaya ghost moths had both the largest MGCs and the largest LFGs (Supplementary Fig. S5). The cumulus, which represents a major area, occupied 23.8% ± 5.5% of the antennal lobes in A. jianchuanensis, 15.9% ± 2.0% in T. armoricanus, and 19.5% ± 3.0% in T. xiaojinensis (Supplementary Fig. S5, Supplementary Data S1). The other species had relatively smaller MGC glomeruli in volume (e.g., on average, the cumulus occupied 9.8% ± 0.7% in Noctuidae). As for females, LFG1 in the 3 ghost moth species was significantly larger than that in higher moths (Supplementary Fig. S5, Supplementary Data S1). We examined the MGCs of 16 species in terms of volumes and shapes using principal component analysis, which explained 73.8% of the variance. Twelve of the 16 lepidopteran species had a similar trend in MGC organization, but the Himalaya ghost moths and H. cunea exhibited separated patterns, and T. xiaojinensis in particular was totally isolated from higher Lepidoptera (Supplementary Fig. S6). We wondered whether these structural specificities reflect the genomic backgrounds and receptor repertoires of the Himalaya ghost moths.
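The volume proportions above can be read as a glomerulus volume divided by the antennal lobe volume, summarized across replicate brains. The sketch below illustrates that arithmetic only; it assumes the reported spread is a sample standard deviation (the text does not state whether SD or SEM was used), and the function names and toy volumes are hypothetical.

```python
from statistics import mean, stdev


def volume_percent(glomerulus_volume, lobe_volume):
    """Percentage of the antennal lobe volume occupied by one glomerulus.

    Both volumes must share the same unit (e.g., cubic micrometers).
    """
    if lobe_volume <= 0:
        raise ValueError("lobe_volume must be positive")
    return 100.0 * glomerulus_volume / lobe_volume


def summarize(percentages):
    """Return (mean, sample SD) of per-brain percentages.

    Assumption: spread is reported as a sample standard deviation.
    """
    return mean(percentages), stdev(percentages)


# Toy per-brain measurements (not data from this study).
brains = [(2.0, 10.0), (3.0, 10.0), (1.0, 10.0)]
per_brain = [volume_percent(g, lobe) for g, lobe in brains]
avg, sd = summarize(per_brain)
print(f"{avg:.1f}% ± {sd:.1f}%")
```

Normalizing by lobe volume before averaging keeps brains of different absolute size comparable across species.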

A unique large OR array on the ghost moth T. xiaojinensis chromosome
A total of 23 TxiaOrs, from among the annotations of the genome and antennal transcriptome assemblies, were confirmed to be expressed in the antennae of T. xiaojinensis via reverse transcription PCR verification (Supplementary Fig. S7, Supplementary Data S2). This number was higher than those found in A. jianchuanensis (10) and T. armoricanus (16) (Supplementary Table S2). Notably, a large array comprising 34 tandem duplications was mapped by higher moth PRs on chr14 of the chromosome-level assembly of T. xiaojinensis. This array, homologous to TxiaOr19, contained 16 homologs (TxiatdOrs) and 18 pseudogenes (TxiatdpOrs) and represents the largest tandem duplication reported in lepidopterans (Supplementary Fig. S8, Supplementary Data S2). The TxiaOr19 array was located on the same chromosome as an upstream TxiaOr18c, which mapped to Noctuidae homologs [36]. Maximum likelihood phylogeny analysis using 387 ORs showed that the TxiaOr19 array joined an earlier group in which canonical type I PRs arose (Fig. 2A, Supplementary Data S3). The earlier separation of canonical PRs and the TxiaOR19 tandem was also observed when cross-checked with the neighbor-joining and Bayesian methods (Supplementary Fig. S9). A female-biased expression pattern was observed in TxiatdOr15 and TxiatdOr25 of the 16 TxiaOr19 homologs (Supplementary Fig. S10). Differing from ORs found in evolutionarily later moths, the TxiaOR19 array could be matched by BLAST to ORs in locusts, aphids, soldier flies, mosquitoes, and fleas, with homologs predicted by CLAN [37] (Fig. 2A and Supplementary Figs. S9, S11). However, the male-biased TxiaOR7 failed to match any ORs from those earlier species (Supplementary Figs. S11, S12).
We investigated the evolution of the TxiaOr19 array by mapping it to chromosomes of caddisflies, as well as primitive and higher moths, and cross-checking with canonical PR-mapped regions (Fig. 2B). The results indicated that tandem ORs were identified in linearized regions of almost all species, except for C. flavipennella and B. mori. Some of these linearized regions did not contain ORs that could be annotated using FGENESH [38]. Mapped ORs from the linearization analysis and ORs retrieved by BLAST of the TxiaOR19 tandem against the nr database were used in a Bayesian phylogenetic analysis, which showed that canonical PRs and a single LarmPR1 in the caddisfly formed a clade in the phylogeny (Fig. 2A, Supplementary Fig. S13), and expansions of PR tandems were consistently observed within species after the Himalaya ghost moths, except for C. flavipennella (Fig. 2B, Supplementary Fig. S13). Notably, regions containing TxiaOr19 and PR had mixed tandem patterns in the early moths succeeding T. xiaojinensis but tended to be separated in later species (Fig. 2B). Bayesian phylogenetic analysis showed that the homologs of TxiaOr19 underwent significant diversification from their ancestors (Supplementary Fig. S13). Specifically, the TxiaOr19 array formed the same clade with the LarmOr19-Or13a tandem (Fig. 2A, Supplementary Fig. S13). The majority of PR-mapped ORs formed a single cluster that possibly diverged from LarmPR1, with a few remaining in the TxiaOR19-mapped clades (Fig. 2A, Supplementary Fig. S13).
Using the MEME suite, we found that LarmPR1 exhibited all 3 signature motifs of PR consensus regions as reported before [12], indicating the potential emergence of canonical PRs prior to the evolution of lepidopteran insects (Fig. 2C). However, most PR-mapped ORs from non-Ditrysia primitive moths did not meet the requirements of canonical PR motifs (Fig. 2C). On the other hand, the majority of TxiaOR19-mapped ORs had 2 motifs, with some having 3 motifs (named motifs 4-6) upon checking with the same approach (Fig. 2D). Notably, motifs 4-6 from TxiaOR19 homologs showed overlaps with motifs 2-3 from PR homologs, suggesting the existence of a common ancestor for TxiaOR19 and canonical PRs at an earlier evolutionary stage (Fig. 2E).

Transposable elements involved in the evolution of ORs
Tandem gene duplications and chromosome linearization patterns have been reported to be associated with transposable element (TE) activities [39]. To explore how a large and specific OR array was formed in the ghost moth T. xiaojinensis, we characterized the landscape of TEs in the genomes of the above species. It showed that 2-3 TE burst events occurred in the Himalaya ghost moths. These bursts likely took place around the same time as the successive divergences of Exoporia and Hepialus (Figs. 1A and 3A). Caddisflies and primitive and modern Lepidoptera experienced more recent bursts of TE activity compared to Himalaya ghost moths (Fig. 3A, Supplementary Fig. S14). Specifically, various TE arrangements were observed in previously identified OR loci. The TE arrangements of the TxiaOr19 tandem were correlated with that of LarmOr13a (Supplementary Fig. S15). Furthermore, the TE landscapes in the TxiaOr18/19 homologs were found to be similar mostly to those of Ors from non-Ditrysia species and later diversified in higher moths (Supplementary Fig. S15).

Figure 2: Evolution of sex pheromone-related odorant receptors among T. xiaojinensis, caddisfly, and other Lepidoptera. (A) Rooted maximum likelihood (ML) tree of 387 selected lepidopteran ORs, which included reported ORs of moths, mapped ORs in the linearization analysis in (B) by the TxiaOR19 tandem, and ORs obtained from prelepidopteran species by blasting with the TxiaOR19 tandem against the nr database (Supplementary Data S4). The evolutionary distances were computed using the "Auto" option in IQ-TREE [68] with ultrafast …

Figure 3: (A) Detailed TE landscapes of the ghost moths, caddisflies, and white-barred gold. Times of TE burst events were estimated according to CpG-adjusted Kimura substitution levels and a reported arthropod substitution rate of 6.19 × 10⁻¹⁰ per site per generation [76]. Red arrowheads indicate TE burst events. (B) Overview of asymmetric divergence of duplicated pheromone-related ORs from caddisfly to higher Lepidoptera. The TxiaOR18c/19 duplications predominate in caddisflies and primitive moths, but they were replaced by functional PR duplications in higher moths during evolution.
In conclusion, our findings suggest that both the TxiaOR19 tandem and PR clusters had already emerged in caddisflies. The OR19 lineage underwent expansion within the ancestral moth lineage, leading to the formation of a large duplicated tandem in T. xiaojinensis. Furthermore, linkages between OR19 and PRs were observed in species predating Ditrysia. In later higher moths, PRs became predominant, while the OR19 cluster contracted through asymmetric diversifications of their homologs, as supported by the similar TE arrangements (Fig. 3B).

Equal sex roles of ghost moth T. xiaojinensis adults
The replacement of the TxiaOR19 duplications with canonical PRs suggests that the tuning characteristics of TxiaOR19 may differ from those of a PR that is narrowly tuned to sex pheromone components. To test this assumption, we first analyzed the emissions of adult T. xiaojinensis using both solvent extraction and solid-phase microextraction (SPME) methods. We found that male and female adults were not distinguishable by tracing the volatile blends in abdomen tip extractions. However, SPME samples collected within the first 24 hours after female emergence exhibited a significant peak corresponding to oleamide (Supplementary Fig. S16A). We performed successive docking simulations using the TxiaOR19 array and 4 sex-biased TxiaORs against the identified major components. The results showed that the TxiaOR19 array had lower binding affinity toward the panel of ghost moth emissions, and the response spectrum was relatively broad for all 16 tandem ORs (Supplementary Fig. S16B).
Considering that ghost moths, including T. xiaojinensis, lack sexual dimorphism in their antennal lobes (Fig. 1D), we hypothesized that T. xiaojinensis may also exhibit unconventional calling and mating behaviors compared to higher moths. We then tested adult pairs of T. xiaojinensis in a courtship arena (Fig. 4A). Unlike the female calling behaviors of higher moths, with wing beats and an extruded pheromone gland, the calling behavior of this ghost moth involved hovering with wing beats. Females fluttered with substantial wing beats, while males fluttered with small-range, vibration-like wing beats (Fig. 4B). Interestingly, male and female adults exhibited similar amounts of calling behavior and similar tracing velocities (Fig. 4B, C). Our results indicated that the sex roles of T. xiaojinensis adults differed from those of higher lepidopterans during mating allocation, which may be attributed to the predicted unconventional TxiaOR19 tandem, noncanonical female emission, and olfactory architectural observations.

Discussion
Himalaya ghost moths offer a basal model for the study of olfaction evolution in Lepidoptera due to their unique olfactory system, limited distribution, and possession of the largest genomes in Lepidoptera. Our study generated the first chromosome-level assembly for this lineage, providing valuable insights into the genetic characteristics of these ghost moths. We discovered that the OR evolutionary pathway in the ghost moth T. xiaojinensis parallels that of modern moths and identified molecular traces that reveal the origins of the modern pheromone sensory system. Interestingly, the presence of unique olfactory structures, including enlarged glomeruli observed in both sexes and dominant long trichoid sensilla, the mostly nonbiased expression of TxiaOr19 homologs, and their nonfeeding life traits in adulthood suggest that these ghost moths may employ a primitive pheromone sensing system to locate potential partners.
The lineage of Himalaya ghost moths may have remained primitive because of their isolated habitats, an uneven life cycle with a long larval stage (3-6 years in nature), and a brief adult stage (several days) [26, 27], which greatly reduce the evolution speed of this lineage. They have developed a unique evolutionary strategy of focusing solely on reproduction in adulthood while forgoing foraging [40]. The redundancy of their giant genomes is an example of the basal genomic features possessed by the ghost moths [41]. Asymmetrically diverged duplications and frequent TE activity bursts have played a critical role in both PR formation and the emergence of other functional genes in Lepidoptera [23]. As a result, the expanded OR19 homologs in primitive species were later diversified into canonical PRs, establishing the advanced sex pheromone-based communication system. One interesting result in our molecular docking predictions is that the male-biased OR7 shows a higher affinity to oleamide, suggesting that TxiaOR7 potentially serves as an ancestral PR of the novel PR clade in lepidopteran species. However, more experimental functional evidence needs to be provided for both the TxiaOR19 array and TxiaOR7 to support their evolutionary roles in lepidopteran species.
Canonical PRs of ancient moth species were broadly characterized [13, 43]. In this study, we showed that the TxiaOR19 array and LarmPR1 share motifs. These motifs were likely to be already separated before Lepidoptera and Trichoptera diverged from each other. Considering that motif shifts accompany similar transposable element arrangements within the tested OR loci, it is likely that the first PR emerged from exonization, driven by TE activities. This effect has been commonly observed in other organisms [44]. On the other hand, PRs shared motif regions with TxiaOR19, and the latter could be matched by BLAST to ORs in dipteran species, where large OR duplications existed, such as in Bactrocera dorsalis [45]. The disappearance of large tandem arrays in later lepidopterans suggests the separation of ancestral duplicated ORs. This can be supported by scattered chromosome linearization and increased DNA transposons during TE activity bursts. The duplicated ORs themselves could reflect rapid olfactory evolution for species adaptation [46]. Although TEs may not be determining factors for the functional emergence of ORs, as shown in the clonal raider ant Ooceraea biroi [47], we cannot exclude possible TE involvement in PR emergence due to the horizontally diverged TE landscapes among lepidopteran ORs.
Evolutionarily later moths show larger interspecies variation and increased numbers of glomeruli in the antennal lobes, suggesting potential positive selection in olfactory systems, along with their distribution in different ecological niches [48]. The enlarged MGCs in butterflies, such as Pieris rapae and Godyris zavaleta [49], suggest that sexually dimorphic brain organization is widely present in Lepidoptera, but the link between morphology and pheromone recognition in butterflies requires further examination. In contrast, sexually dimorphic olfactory neuropils do not apply to Himalaya ghost moths, as females have also retained the enlarged LFGs. We argue that female LFGs are more likely involved in mating allocation rather than egg-laying orientation, since these ghost moths spray eggs in nature, unlike the majority of modern moths, which lay eggs in selected locations [40, 50]. Therefore, they may not require a sophisticated olfactory system for precise assessment of oviposition sites. This is also supported by our behavioral assays in T. xiaojinensis, in which both males and females moved to find partners. Given their ecological traits, highly redundant genome, and smaller number of ORs, we believe that these Himalaya ghost moths may retain pheromone-sensing abilities and exhibit primitive sex roles within Lepidoptera. Further investigation into the function of ORs remains crucial for refining this perspective.
Himalaya ghost moths retain ancestral traits of the olfactory system for equal sex roles, which may be attributed to the mechanisms of asymmetric divergence and redundant genome formation. The lack of sexual dimorphism in the antennal lobes and the expansion of the TxiaOR19 array rather than canonical PRs also contribute to the nonbiased sex role differentiation within this primitive lineage. Overall, these findings highlight the unique evolutionary features of Himalaya ghost moths and shed light on the mechanisms shaping olfactory systems in insects.

Insects
Newly emerged lepidopteran species from lab colonies were sexed, and 3- to 5-day-old adults were used in all tests. A. jianchuanensis, T. armoricanus, T. xiaojinensis, Agrotis ipsilon, S. frugiperda, Galleria mellonella, and P. rapae were obtained from the Institute of Zoology, Guangdong Academy of Sciences. H. cunea was obtained from the Chinese Academy of Forestry. Athetis dissimilis was obtained from Henan University of Science and Technology. Helicoverpa armigera and Helicoverpa assulta were obtained from Henan Agricultural University. Mythimna separata, C. pomonella, and S. litura were obtained from the Institute of Plant Protection, Chinese Academy of Agricultural Sciences. S. exigua was obtained from Qingdao Agricultural University. P. xylostella was obtained from Shanxi Agricultural University. All lab colonies were regularly rejuvenated with natural populations.

Morphometric measurement
A total of 7 strains of Himalaya ghost moths were measured for the lengths of antennae and forewings. The three T. xiaojinensis strains were from the lab colony mentioned above and from the Xiaojin (N30.99, E102.27) and Hongkou (N31.16, E103.84) field populations. The other 4 strains consisted of 2 A. jianchuanensis populations collected from Jiulong (N28.99, E101.51) and Gongga (N29.56, E101.98) and 2 T. armoricanus populations from Yala (N30.11, E102.25) and Kangding (N30.08, E101.97), respectively. Intact appendages were removed and embedded on glass slides before being processed under an AXIO Imager microscope (Zeiss) equipped with an Axiocam 512 camera (Zeiss). ZEN 2.3 software (Zeiss) was used to acquire scale bar-labeled photographs of antennae and wings. Lengths of interest were manually calibrated against the scale bar using ImageJ 1.53f51 (National Institutes of Health) and then recorded. A total of 5 to 21 replicates were carried out for each strain, and means were used to develop olfactory indexes using the formula [antenna/wing]. Data from other species were taken from a previous publication [35].
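The olfactory index described above is a simple ratio of mean antenna length to mean forewing length per strain. The following sketch shows one way to compute it; the function name, the unit (millimeters), and the toy measurements are assumptions made for illustration and are not taken from the study's data.

```python
from statistics import mean


def olfactory_index(antenna_lengths_mm, forewing_lengths_mm):
    """Ratio of mean antenna length to mean forewing length for one strain.

    Both inputs are lists of replicate measurements in the same unit.
    The two lists may differ in size because means are taken first.
    """
    if not antenna_lengths_mm or not forewing_lengths_mm:
        raise ValueError("both measurement lists must be non-empty")
    return mean(antenna_lengths_mm) / mean(forewing_lengths_mm)


# Toy replicate measurements (hypothetical values).
antennae = [2.0, 4.0]
forewings = [10.0, 20.0]
print(round(olfactory_index(antennae, forewings), 3))
```

Dividing by forewing length normalizes antenna length to body size, so strains and species of different sizes can be compared on one scale.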

Scanning electron microscopy
The antennae of 1- to 3-day-old adults were cut from the base and fixed in 0.25% glutaraldehyde at 4 °C overnight. After 3 washes at room temperature with 0.1 M phosphate-buffered saline (PBS, pH 7.4) for 15 minutes each, the antennae were dehydrated through a graded ethanol series (30%, 50%, 70%, 80%, 90%, and 100%) for 15 minutes each and dried for 15 minutes in a critical point drier (Bal-Tec CPD 030) before being mounted on aluminum stubs. The mounted antennae were coated with gold spray (Bal-Tec SCD 005) and observed with an SEM instrument (FEI Quanta 200).

Antennal lobe atlas
Lepidopteran brains were labeled according to previous work [21]. Newly dissected intact brains were successively processed by fixation with 4% paraformaldehyde in 0.1 M PBS (24 hours), preincubation with 5% normal goat serum in 0.1 M PBS containing 0.5% Triton X-100 (NGS-PBST) (0.5 hours), incubation with 1% SYNORF1 (Developmental Studies Hybridoma Bank, University of Iowa) in 5% NGS-PBST (72 hours), and incubation with Alexa Fluor 488 goat anti-mouse (Invitrogen) at 1:500 with 1% NGS-PBST (48 hours). After being rinsed 6 times in PBS and dehydrated with a graded ethanol series, brain samples were mounted with antifade mounting medium (Beyotime) in perforated aluminum slides that were sandwiched between 2 glass coverslips. Three brains of each sex from each species were prepared for imaging.
All image stacks were acquired with a confocal laser scanning microscopy system with a 10-20× objective. Data for A. jianchuanensis, T. armoricanus, T. xiaojinensis, G. mellonella, and A. ipsilon were collected with FV3000 (Olympus). Data for H. cunea, A. dissimilis, H. armigera, H. assulta, M. separata, S. litura, and P. xylostella were collected with LSM 780 (Zeiss). Data for S. frugiperda, P. rapae, C. pomonella, and S. exigua were collected with A1 HD25 (Nikon). An argon laser at 488 nm was used to excite the Alexa Fluor. The resolution of the x-axis was set to 500-2,048 voxels and the section interval was set to 3 or 5 μm. Amira software (AMIRA 5.3; Visage Imaging) was used as previously described to conduct segmentation, tissue statistics, and 3-dimensional reconstructions of the antennal lobes [21].

Genome and transcriptome sequencing
Genomic DNA of a T. xiaojinensis larva was extracted for library establishment and then sequenced with the Nanopore PromethION platform (Oxford Nanopore Technology). After quality control, a total of 319.9 Gb of clean data were assembled using NextDenovo (RRID:SCR_025033). Meanwhile, a short-read library was constructed on the Illumina platform with the same batch of T. xiaojinensis DNA, and 165 Gb of raw data were generated. After filtering, the remaining clean reads with Q > 20 were used for minimap2 mapping onto the genome assembly, which was later polished by NextPolish (RRID:SCR_025232). To remove DNA contamination from other organisms, the polished genome was aligned against the NCBI nucleotide (NT) database, and the contigs that aligned to sequences from fungi, plants, or viruses were removed.
To obtain a chromosome-level assembly, Hi-C scaffolding was further carried out with the same larval sample following reported protocols [ 51-53 ]. Specifically, samples were fixed using 2% formaldehyde to establish cross-links, followed by cell lysis and sample quality assessment through extraction. Chromatin digestion was carried out using a restriction endonuclease, with enzyme cleavage efficacy evaluated through sampling. Subsequent steps included biotin-14-dCTP (Invitrogen) labeling, blunt-end ligation, DNA purification, and Hi-C sample preparation. After passing quality control, Hi-C fragments underwent end-biotin removal, sonication, end repair, A-tailing, and adapter ligation to form ligated products. These products were then amplified by PCR to generate library-enriched products. Library amplification products were sampled for Hi-C fragment junction quality control, and the entire library preparation was sequenced using Illumina HiSeq with a PE150 sequencing strategy (NextOmics Biotech). fastp v.0.12.6 (RRID:SCR_016962) with default parameters was used to filter the raw sequences, resulting in high-quality clean reads. The sequenced Reads1 and Reads2 were separately aligned to the assembled genome sequence using bowtie2 v.2.3.2 (end-to-end alignment mode, parameters: --very-sensitive -L 30) (RRID:SCR_016368) to obtain the alignment information. For the reads left unmapped after alignment, we searched for reads containing ligation junction sites, trimmed them, and performed alignment again. Finally, the alignment results were combined, and the proportion of uniquely mapped paired-end reads was calculated. The LACHESIS software (RRID:SCR_017644) was used to cluster the contig sequences of the draft assembly into chromosome groups using agglomerative hierarchical clustering. The final genome was further assessed with BUSCO [ 54 ] for completeness.
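The separate alignment of the two Hi-C read ends can be sketched as follows. This is a minimal illustration, not the pipeline actually used: it only assembles the bowtie2 command line from the options named in the text. The index prefix, file names, and thread count are hypothetical placeholders.

```python
import shlex


def bowtie2_hic_command(index_prefix, fastq, sam_out, threads=4):
    """Build a bowtie2 call for one Hi-C read end (Reads1 or Reads2)."""
    return [
        "bowtie2",
        "--end-to-end",       # end-to-end alignment mode, as stated in the text
        "--very-sensitive",   # sensitivity preset named in the text
        "-L", "30",           # seed length of 30, as stated in the text
        "-p", str(threads),   # thread count: an assumption, not from the text
        "-x", index_prefix,   # hypothetical index prefix made with bowtie2-build
        "-U", fastq,          # each read end is aligned on its own, as unpaired
        "-S", sam_out,        # SAM output path (hypothetical)
    ]


# Print one command per read end; nothing is executed here.
for mate in ("R1", "R2"):
    cmd = bowtie2_hic_command("txia_draft", f"hic_{mate}.clean.fq.gz", f"hic_{mate}.sam")
    print(shlex.join(cmd))
```

Aligning each end independently avoids the insert-size expectations of paired-end mode, which Hi-C read pairs do not satisfy because the two ends can originate from distant loci.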
The genome of T. armoricanus was obtained from the DNA of a fourth-instar larva without gut. A total of 23 libraries with different insert sizes were constructed and 67 lanes were sequenced on an Illumina HiSeq2000 platform (RRID:SCR_020130), resulting in 1,344.5 Gb of raw data and 877.7 Gb of filtered data. The genome was assembled using SOAPdenovo (RRID:SCR_010752) (v2.04) [ 55 ] and SSPACE (RRID:SCR_005056) (v2.0) [ 56 ] software. We used all 549.3 Gb (180.4×) of clean data from short-insert-size libraries to construct contigs and all 877.7 Gb (266.4×) of clean data to construct scaffolds. In total, 283.4 Gb (86.0×) of data from large-insert-size libraries were used again to construct scaffolds using SSPACE. Then, all clean data from short-insert-size libraries were used to fill the gaps. TrimDup3 (Rabbit2.6) [ 57 ] was used to remove the large redundant sequences. RNA sequencing data from 14 different developmental stages of T. armoricanus were assembled by Trinity (RRID:SCR_013048) v2.4.0 [ 58 ] and mapped to the assembled genome sequence using BLAT (RRID:SCR_011919) (v. 34) [ 59 ] to check the coverage rate. The results showed that 96.8% of the sequences could be mapped to the assembly.
Antennae, heads, and labial palps from A. jianchuanensis and T. xiaojinensis were each collected in liquid nitrogen and sequenced with Illumina according to the manufacturer's instructions. The transcriptomes were assembled by Trinity v2.4.0 [ 58 ] with default parameters.

Phylogenetic analysis and estimation of divergence time
To reconstruct the phylogenetic tree of 16 lepidopteran insect species with 2 outgroups, Tribolium castaneum and Drosophila melanogaster, we first downloaded the genome annotations or raw transcriptome data for the other 15 species from NCBI (Supplementary Table S1). The transcripts were assembled by Trinity v2.4.0 [ 58 ] with default parameters. Subsequently, the orthologs of these 18 insect species were inferred from their genomic or transcriptomic protein annotations using OrthoFinder (RRID:SCR_017118) [ 60 ] with default parameters. Single-copy orthologs from each species were selected for phylogenetic reconstruction. The protein sequences of each ortholog were independently aligned with MAFFT (RRID:SCR_011811) v7.407 [ 61 ], and the aligned results were trimmed by trimAl (RRID:SCR_017334) [ 62 ] with the parameter "-automated1" to remove low-quality regions; the trimmed sequences were then concatenated into a single supersequence. RAxML (RRID:SCR_006086) [ 63 ] was then used with the VT + F model, as inferred by ProtTest (RRID:SCR_014628) v3.4.2 [ 64 ], to estimate a maximum likelihood tree, starting with 1,000 bootstraps followed by likelihood optimization.
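The concatenation of trimmed single-copy ortholog alignments into one supersequence per species can be illustrated with a short sketch. The function name, the gap-padding of absent taxa, and the toy orthogroup and taxon labels are assumptions made for illustration; the text does not say which tool performed this step.

```python
def concatenate_alignments(alignments, taxa):
    """Join per-gene alignments (dict: taxon -> aligned sequence) into a supermatrix.

    Returns the concatenated sequences and a list of (gene, start, end)
    partitions with 1-based inclusive coordinates.
    """
    supermatrix = {taxon: [] for taxon in taxa}
    partitions = []
    position = 0
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences differ in length, not an alignment")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon missing from a gene is padded with gaps (assumption;
            # strict single-copy orthologs should have every taxon present).
            supermatrix[taxon].append(aln.get(taxon, "-" * width))
        partitions.append((gene, position + 1, position + width))
        position += width
    return {t: "".join(parts) for t, parts in supermatrix.items()}, partitions


# Toy data with hypothetical orthogroup names and taxon labels.
toy = {
    "OG0001": {"Txia": "MKT-L", "Sfru": "MKTAL", "Dmel": "MRTAL"},
    "OG0002": {"Txia": "GGW", "Sfru": "GAW", "Dmel": "G-W"},
}
matrix, parts = concatenate_alignments(toy, ["Txia", "Sfru", "Dmel"])
print(matrix["Txia"])  # MKT-LGGW
print(parts)           # [('OG0001', 1, 5), ('OG0002', 6, 8)]
```

Keeping the partition coordinates alongside the supermatrix makes it possible to trace any site in the concatenated alignment back to its source gene.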
We used r8s (RRID:SCR_021161) (V1.7.1) [ 65 ] to estimate the divergence time. The phylogenetic tree constructed by RAxML [ 63 ] was used as the input tree. A smoothing parameter of 3 was selected, which was estimated by the cross-validation approach (with parameters "cvstart = 0, cvinc = 1, cvnum = 18"). The calibration points were (i) the most recent common ancestor of the clade, including T. castaneum and P. xylostella, constrained to be 337 million years ago (Mya); (ii) the most recent common ancestor of the clade, including D. melanogaster and C. pomonella, constrained to be 318 Mya; and (iii) the most recent common ancestor of the clade, including P. rapae and S. litura, constrained to be 125 Mya [ 42 ].

Annotation of the Or gene family
The protein sequences of lepidopteran insect ORs were collected from NCBI. These protein sequences were then used as queries in iterative TBLASTN searches with the parameter "-evalue 1e-5" against the assemblies of the 3 ghost moth species to find candidate Or genes. A local command-line HMMER (RRID:SCR_005305) (version 3.1b2) [ 66 ] search was conducted for these candidate ORs against the Pfam-A database (RRID:SCR_004726) to find the 7tm_6 (PF02949) or 7tm_4 (PF13853) HMM profiles for ORs. FGENESH 2.6 [ 38 ] prediction of potential genes was performed for contigs of interest. Data from other species were collected according to the reported works (Supplementary Table S1).
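As an illustration of how TBLASTN hits might be turned into candidate loci, the sketch below parses tabular BLAST output and re-applies the E-value cutoff. It assumes the search was written in the standard 12-column tabular format ("-outfmt 6"), which the text does not state; the query and contig names in the toy input are invented.

```python
import csv
import io

# Column order of the standard BLAST+ tabular format (-outfmt 6).
BLAST6_FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(handle, max_evalue=1e-5):
    """Yield (contig, start, end, strand, query, evalue) for hits under the cutoff."""
    reader = csv.DictReader(handle, fieldnames=BLAST6_FIELDS, delimiter="\t")
    for row in reader:
        evalue = float(row["evalue"])
        if evalue > max_evalue:
            continue
        sstart, send = int(row["sstart"]), int(row["send"])
        # TBLASTN reports reverse-strand hits with sstart > send.
        strand = "+" if sstart <= send else "-"
        yield (row["sseqid"], min(sstart, send), max(sstart, send),
               strand, row["qseqid"], evalue)


# Toy input: three invented hits, the last one above the cutoff.
toy = io.StringIO(
    "BmorOR1\tcontig12\t45.2\t120\t60\t2\t10\t129\t5000\t5359\t3e-30\t110\n"
    "BmorOR1\tcontig12\t30.1\t40\t25\t1\t200\t239\t9120\t9001\t2e-06\t48.1\n"
    "BmorOR1\tcontig40\t28.0\t35\t24\t1\t50\t84\t700\t804\t0.5\t25.0\n"
)
for hit in filter_hits(toy):
    print(hit)
```

Normalizing the subject coordinates so that start is always below end simplifies the later step of grouping nearby hits on the same contig into one candidate gene region.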

Characterizations of Ors
CDS cloning verifications were carried out on annotated TxiaOrs using adult antennal cDNA. Gene-specific primers were designed (Supplementary Table S3) and PCRs were done on a Veriti 96-well thermal cycler (Applied Biosystems) using High Fidelity (HiFi) PCR SuperMix (Trans). Products were run on 1% agarose (BBI) on a PowerPac electrophoresis system (Bio-Rad) and visualized with a GelDoc-It TS3315 imaging system (UVP). Multiple bands, such as those for TxiaOr18, were separately collected and purified with a gel extraction kit (GenStar) before Sanger sequencing (Sangon Biotech). Later analysis was based on the longest sequenced TxiaOr for each locus. Or expression was shown as autoscaled heatmaps of FPKM values (fragments per kilobase of transcript per million mapped reads), which were calculated by RSEM [ 67 ] from head, antenna, and labial palp transcriptomes of adult ghost moths.
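For readers unfamiliar with the units, the sketch below shows the textbook FPKM definition and one common reading of "autoscaled" (mean-centering each gene and dividing by its standard deviation). Both are assumptions about presentation only: RSEM derives its estimates from a statistical model using effective transcript lengths, and the exact scaling used for the heatmaps is not specified in the text. All numbers are invented.

```python
import statistics


def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def autoscale(values):
    """Center one gene's values on their mean and scale by their sample SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # a flat profile carries no contrast
    return [(v - mean) / sd for v in values]


# Invented example: 500 fragments on a 1,200-bp transcript in a library
# of 20 million mapped fragments.
print(round(fpkm(500, 1200, 20_000_000), 2))

# Invented FPKM values for one gene in head, antenna, and labial palp.
print(autoscale([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```

Row-wise autoscaling makes tissue bias visible for lowly and highly expressed genes alike, at the cost of hiding absolute expression levels.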
Phylogenetic analysis of 387 ORs was carried out with the abovementioned protocol using MAFFT [ 61 ], trimAl [ 62 ], and IQ-TREE (RRID:SCR_017254) [ 68 ] with the "Auto" option for model selection, 1,000 ultrafast bootstraps [ 69 ], and the Shimodaira-Hasegawa-like approximate likelihood ratio test [ 70 ]. The tree topology was verified with MEGA X [ 71 ] and MrBayes (RRID:SCR_012067) 3.2.6 [ 72 ], which were used to establish the NJ tree based on the Dayhoff model and the BY tree based on the Blosum62 model, respectively. Homologs of the TxiaOR19 array were predicted using CLANS [ 37 ] with the blastx results against the NCBI nr database. For chromosome linearization tests, local tblastn was applied to map the selected ORs to the chromosomes of each species (Supplementary Table S1), and results were visualized as circos plots using TBtools (RRID:SCR_023018) v1.113 [ 73 ]. Evolution of mapped ORs was inferred using MrBayes 3.2.6 [ 72 ] under the JTT + F + G4 model (2 parallel runs, 200,000 generations), in which the initial 25% of sampled data were discarded as burn-in. The final average standard deviation of split frequencies was 0.069772. Protein motifs were predicted using MEME Suite (RRID:SCR_001783) v5.5.2 [ 74 ].

Annotation of repeats and transposable element families
For transposable element analysis, we first performed de novo predictions for each species with RepeatModeler (RRID:SCR_015027) version open-1.0.11 to generate a species-specific library. Then we annotated the genome assembly with RepeatMasker (RRID:SCR_012954) version open-4.0.7 with the "ncbi" search algorithm. Annotated transposable element sequences were manually verified and classified with Dfam (RRID:SCR_021168) [ 75 ]. The calcDivergenceFromAlign.pl and createRepeatLandscape.pl scripts in the RepeatMasker package were used to calculate the Kimura divergence values and plot the repeat landscape, respectively. Estimations of transposable element burst times were based on the recently reported substitution rate of 6.19 × 10⁻¹⁰ per site per generation in arthropods [ 76 ].
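The conversion from Kimura divergence to a burst time is not spelled out in the text. As a hedged sketch, the code below gives the standard Kimura 2-parameter distance and a simple divergence-to-time conversion. The `lineages` parameter is an assumption exposing the two conventions in common use (d/2r for two diverging copies, d/r for divergence from a consensus); which one was applied here is not stated. The CpG adjustment performed by the RepeatMasker script is not reproduced.

```python
import math


def kimura_2p(transitions, transversions):
    """Kimura 2-parameter distance from the proportions of sites showing
    transitions (P) and transversions (Q)."""
    p, q = transitions, transversions
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def burst_age_generations(divergence, rate=6.19e-10, lineages=2):
    """Convert a divergence (substitutions per site) into generations.

    rate is the per-site, per-generation substitution rate cited in the text;
    lineages=2 gives d/(2r), lineages=1 gives d/r (choice is an assumption).
    """
    return divergence / (lineages * rate)


# Invented example: a TE peak at 5% Kimura divergence.
d = 0.05
print(f"{burst_age_generations(d):.3e} generations (d/2r)")
print(f"{burst_age_generations(d, lineages=1):.3e} generations (d/r)")
```

Because the rate is expressed per generation, converting the result into years additionally requires a generation time for the species, which this sketch leaves out.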

Chemical analysis
The hexane extraction method was adopted from our previous work on moth pheromone identification [ 77 ]. Abdomen tips of calling adult T. xiaojinensis were cut with dissection scissors and immediately put in 20 μL hexane (HPLC purity; Kermel Chemical Reagent Co.), which was kept at 4 °C for 1 day prior to the test. The headspace SPME method was adopted from our previous work on body surface volatile emissions of insects [ 78 ]. Newly emerged male or female adults were kept in a mesh cage in separate rearing chambers for sampling. A 50/30-μm DVB/CAR/PDMS StableFlex fiber (Supelco) was inserted into the cage for sampling at 10 °C for 24 hours. The sampled volatile blends were either injected at 1 μL or subjected to an Agilent 7890B GC-5977 MSD coupled system equipped with an HP-5MS column (0.25 μm × 30 m × 0.250 mm) (Agilent). A 60-minute oven temperature program was used as follows: 40

Docking simulation
The tertiary structures of TxiaOR19 and the other 4 sex-biased OR sequences of T. xiaojinensis were predicted by AlphaFold2 [ 79 ]. The 3D structures of 18 ligands were downloaded from PubChem [ 80 ]. The Molecular Operating Environment software (MOE; Chemical Computing Group ULC) was used to dock the ligands with ORs. Briefly, ORs were prepared using MOE QuickPrep and ligands were energy minimized with MOE Energy Minimize prior to the simulation. The Triangle Matcher algorithm was selected for placement and the 30 top-scoring placement poses were selected by the London dG empirical scoring function, while the rigid receptor was selected for refinement and top-scoring poses were selected by the GBVI/WSA dG empirical scoring function. The binding free energy of each OR-ligand pair was estimated using the S score function and later used to build a color-coded map.

Courtship arena
The assays were carried out using naive moths 1 day after emergence at the peak mating hours of 6 to 8 p.m. during sunset. One randomly chosen pair of T. xiaojinensis adults was placed in a paper funnel and recorded for 1 hour. A total of 20 pairs were tested and recorded for calling and tracing behaviors. Recorded footage was processed through the idTracker [ 81 ] pipeline to obtain the velocities of moths, expressed as pixel distance per minute. Fluttering behaviors were scored by manually checking each video file.
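The velocity measure, pixel distance per minute, can be illustrated with a minimal sketch. It assumes the tracker output has already been reduced to one (x, y) position per frame, with None for frames where the animal was not detected; this input layout, the function name, and the frame rate are hypothetical and do not describe idTracker's own file format.

```python
import math


def distance_per_minute(track, fps):
    """Sum frame-to-frame displacement (in pixels) within each recorded minute.

    track: list of (x, y) tuples, one per frame; None marks a missing detection.
    fps:   frames per second of the recording.
    """
    frames_per_minute = max(1, int(round(fps * 60)))
    totals = []
    previous = None
    for i, point in enumerate(track):
        minute = i // frames_per_minute
        if minute == len(totals):
            totals.append(0.0)  # first frame of a new minute
        if point is not None and previous is not None:
            # Displacement across a detection gap is credited to the minute
            # in which the animal is seen again.
            totals[minute] += math.dist(previous, point)
        if point is not None:
            previous = point
    return totals


# Toy track: 1 pixel per frame along x, 1 frame per second, 2 minutes.
toy_track = [(i, 0) for i in range(120)]
print(distance_per_minute(toy_track, fps=1))  # [59.0, 60.0]
```

The first minute is one step shorter than the second because the very first frame has no preceding position to measure from.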

Statistics and data processing
Comparison of means was done using either an unpaired t-test or the general linear model (GLM) followed by multiple comparisons according to treatment sizes (SPSS 22.0.0.0; IBM Corp.). Simple linear regression and data plotting were done using Prism 5.01 (GraphPad Software). Multivariate tests were carried out with the MetaboAnalyst (RRID:SCR_015539) 5.0 [ 82 ] server, which integrates R statistics (RRID:SCR_001905). All error bars indicate standard errors of the means unless otherwise indicated in the figure legends.
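For reference, the two quantities named above can be written out in a few lines. This sketch assumes the equal-variance (Student) form of the unpaired t-test, which the text does not specify, and uses invented numbers; it is not a substitute for the SPSS analysis.

```python
import math
import statistics


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def unpaired_t(a, b):
    """Student's t statistic and degrees of freedom for two independent
    samples, using the pooled variance (equal-variance assumption)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Invented measurements for two groups of three individuals each.
group_a = [62.0, 65.0, 63.0]
group_b = [64.0, 66.0, 61.0]
t_stat, df = unpaired_t(group_a, group_b)
print(f"SEM a = {sem(group_a):.3f}, SEM b = {sem(group_b):.3f}")
print(f"t = {t_stat:.3f}, df = {df}")
```

With three individuals per group, as in the brain imaging, the test has only 4 degrees of freedom, so only large differences between means reach significance.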

Figure 1: Phylogenomics and olfactory morphology of the ghost moths compared with other species in Lepidoptera. (A) Dated evolutionary tree of Lepidoptera relationships. Two nonlepidopteran species, D. melanogaster and T. castaneum, were placed on outgroup branches. The tree was inferred through a maximum likelihood analysis of 634,106 amino acid sites from 1,547 strict single-copy genes employing the VT + F model and 1,000 bootstrap replicates. Branch lengths were optimized and node ages estimated using the penalized likelihood method with the truncated Newton algorithm in r8s [ 56 ]. Scale bar is in millions of years. Data resources are listed in Supplementary Table S1. (B) Adult head development of the hepialid T. xiaojinensis compared to the moth S. frugiperda and the butterfly P. rapae. Orange arrow indicates the labial palp. Blue arrow indicates the proboscis, which is lacking in the ghost moth. (C) Antennal sensilla morphology of selected Lepidoptera by scanning electron microscope. (D) Glomerular counts of tested Lepidoptera observed by confocal laser scanning microscopy system. Red bars indicate predicted male MGCs, and green bars indicate female LFGs. Numbers indicate standard errors of means. Lowercase letters indicate significant differences of glomerular counts among species (GLM and Tukey HSD, male: F(15, 32) = 48.2, P < 0.0001; female: F(15, 32) = 30.9, P < 0.0001).

Figure 3: Genome and OR evolution reflected by landscapes of transposable elements (TEs). (A) Detailed TE landscapes of the ghost moths, caddisflies, and white-barred gold. Times of TE burst events were estimated according to CpG-adjusted Kimura substitution levels and a reported arthropod substitution rate of 6.19 × 10⁻¹⁰ per site per generation [ 76 ]. Red arrowheads indicate TE burst events. (B) Overview of asymmetric divergence of duplicated pheromone-related ORs from caddisfly to higher Lepidoptera. The TxiaOR18c/19 duplications predominate in caddisflies and primitive moths, but they were replaced by functional PR duplications in higher moths during evolution.

Figure 4: Resulting dual attraction in sex communications of the ghost moth adults. (A) Schematic shows setup of the courtship arena of T. xiaojinensis adults. (B) Comparison of calling rates, which were reflected by fluttering behaviors in both sexes (binary test against even distribution, P = 0.33). (C) Left shows representative behavioral traces of male and female adults tracked by idTracker [ 81 ]. Right shows comparison of distance per minute between male and female T. xiaojinensis adults (Mann-Whitney test, U = 137, P = 0.8055).
°C for 2 minutes, 40 °C to 150 °C at 5 °C/min, 150 °C for 2 minutes, 150 °C to 200 °C at 10 °C/min, 200 °C for 5 minutes, 200 °C to 230 °C at 5 °C/min, and 230 °C for 18 minutes. Raw data were analyzed with MSD ChemStation (G1701FA F.01.03.2357) by searching against an NIST 17 MS library (Agilent). A total of 40 individuals were tested for SPME from 2 stratified groups. Each hexane extraction sample included 20 individuals and at least 3 replicates were done for each sex.
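The GC oven temperature program given in the Chemical analysis section can be checked against its stated 60-minute duration with a short calculation. The step encoding and function name below are illustrative choices; the temperatures, rates, and hold times are those listed in the text.

```python
def oven_program_minutes(start_temp, steps):
    """Total run time of a GC oven program.

    steps: sequence of ("hold", minutes) or
           ("ramp", target_temp_C, rate_C_per_min).
    """
    temp = start_temp
    total = 0.0
    for step in steps:
        if step[0] == "hold":
            total += step[1]
        elif step[0] == "ramp":
            _, target, rate = step
            total += abs(target - temp) / rate  # time needed to reach the target
            temp = target
        else:
            raise ValueError(f"unknown step type: {step[0]}")
    return total


program = [
    ("hold", 2),         # 40 °C for 2 minutes
    ("ramp", 150, 5),    # 40 °C to 150 °C at 5 °C/min
    ("hold", 2),         # 150 °C for 2 minutes
    ("ramp", 200, 10),   # 150 °C to 200 °C at 10 °C/min
    ("hold", 5),         # 200 °C for 5 minutes
    ("ramp", 230, 5),    # 200 °C to 230 °C at 5 °C/min
    ("hold", 18),        # 230 °C for 18 minutes
]
print(oven_program_minutes(40, program))  # 60.0
```

The three ramps take 22, 5, and 6 minutes and the holds add 27 minutes, which together match the 60-minute program length reported.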